442. Find All Duplicates in an Array
Medium

Given an integer array nums of length n where all the integers of nums are in the range [1, n] and each integer appears once or twice, return an array of all the integers that appear twice.

You must write an algorithm that runs in O(n) time and uses only constant extra space.

 

Example 1:

Input: nums = [4,3,2,7,8,2,3,1]
Output: [2,3]

Example 2:

Input: nums = [1,1,2]
Output: [1]

Example 3:

Input: nums = [1]
Output: []

 

Constraints:

  • n == nums.length
  • 1 <= n <= 10^5
  • 1 <= nums[i] <= n
  • Each element in nums appears once or twice.
Solution


Approach 1: Brute Force

Intuition

Check for a second occurrence of every element in the rest of the array.

Algorithm

When we iterate over the elements of the input array, we can simply look for any other occurrence of the current element in the rest of the array.

Since an element can only occur once or twice, we don't have to worry about getting duplicates of elements that appear twice:

  • Case - I: If an element occurs only once in the array, when you look for it in the rest of the array, you'll find nothing.
  • Case - II: If an element occurs twice, you'll find the second occurrence of the element in the rest of the array. When you chance upon the second occurrence in a later iteration, it'd be the same as Case - I (since there are no more occurrences of this element in the rest of the array).
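The article doesn't include code for this approach; a minimal Java sketch (class and method names are illustrative) could look like:

```java
import java.util.ArrayList;
import java.util.List;

class BruteForce {
    // For every element, scan the rest of the array for a second occurrence.
    public static List<Integer> findDuplicates(int[] nums) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < nums.length; i++) {
            for (int j = i + 1; j < nums.length; j++) {
                if (nums[j] == nums[i]) {
                    result.add(nums[i]);
                    break; // each value appears at most twice, so stop early
                }
            }
        }
        return result;
    }
}
```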

Complexity Analysis

  • Time complexity : O(n^2).

    For each element in the array, we search for another occurrence in the rest of the array. Hence, for the i-th element in the array, we might end up looking through all n - i remaining elements in the worst case. So, we can end up going through about n^2 elements in the worst case:

    (n-1) + (n-2) + ... + 1 + 0 = Σ_{i=1}^{n} (n - i) = n(n-1)/2, which is O(n^2).

  • Space complexity : No extra space required, other than the space for the output list.



Approach 2: Sort and Compare Adjacent Elements

Intuition

After sorting, all elements of equal value are placed together: equal elements form contiguous blocks in the sorted array.

Algorithm

  1. Sort the array.
  2. Compare every element with its neighbors. If an element occurs more than once, it'll be equal to at least one of its neighbors.

To simplify:

  1. Compare every element with its predecessor.
    • Obviously the first element doesn't have a predecessor, so we can skip it.
  2. Once we've found a match with a predecessor, we can skip the next element entirely!
    • Why? Well, if an element matches with its predecessor, it cannot possibly match with its successor as well. Thus, the next iteration (i.e. comparison between the next element and the current element) can be safely skipped.
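As a sketch (not the article's official code), the predecessor-comparison idea might look like this in Java, with the skip folded into the loop by advancing the index after a match:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SortAndCompare {
    public static List<Integer> findDuplicates(int[] nums) {
        Arrays.sort(nums); // in-place comparison sort, O(n log n)
        List<Integer> result = new ArrayList<>();
        for (int i = 1; i < nums.length; i++) {
            if (nums[i] == nums[i - 1]) {
                result.add(nums[i]);
                i++; // skip the successor: no element appears three times
            }
        }
        return result;
    }
}
```

Note that sorting mutates the input array; if that matters, sort a copy (at the cost of the constant-space claim).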

Complexity Analysis

  • Time complexity : O(n log n) + O(n) ≃ O(n log n).

    • A performant comparison-based sorting algorithm will run in O(n log n) time. Note that this can be reduced to O(n) using a special sorting algorithm like Radix Sort.

    • Traversing the array after sorting takes linear time, i.e. O(n).

  • Space complexity : No extra space required, other than the space for the output list. Sorting can be done in-place.



Approach 3: Store Seen Elements in a Set / Map

Intuition

In Approach 1 we used two loops (one nested within the other) to look for two occurrences of an element. In almost all similar situations, you can usually substitute one of the loops with a set / map. Often, it's a worthy trade-off: for a bit of extra memory, you can reduce the order of your runtime complexity.

Algorithm

We store all elements that we've seen till now in a map / set. When we visit an element, we query the map / set to figure out if we've seen this element before.
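A minimal Java sketch of this approach (illustrative, not from the article):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SeenSet {
    public static List<Integer> findDuplicates(int[] nums) {
        Set<Integer> seen = new HashSet<>();
        List<Integer> result = new ArrayList<>();
        for (int x : nums) {
            // Set.add returns false when the element is already present,
            // i.e. when x is being seen for the second time.
            if (!seen.add(x)) {
                result.add(x);
            }
        }
        return result;
    }
}
```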

Complexity Analysis

  • Time complexity : O(n) average case, O(n^2) worst case.

    • It takes a linear amount of time to iterate through the array.
    • Lookups in a hash set take constant time on average, but can degrade to linear time in the worst case. Note that an alternative is to use tree-based sets, which guarantee logarithmic-time lookups.
  • Space complexity : Up to O(n) extra space required for the set.

    • If you are tight on space, you can significantly reduce your physical space requirements by using bitsets [1] instead of sets. This data structure requires just one bit per element, so you need only n bits for elements that go up to n. Of course, this doesn't reduce your space complexity: bitsets still grow linearly with the range of values the elements can take.
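For instance, in Java the same seen-set idea can be expressed with java.util.BitSet; a sketch relying on the constraint 1 <= nums[i] <= n:

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

class BitsetSeen {
    public static List<Integer> findDuplicates(int[] nums) {
        // One bit per possible value 1..n instead of a boxed Integer per entry.
        BitSet seen = new BitSet(nums.length + 1);
        List<Integer> result = new ArrayList<>();
        for (int x : nums) {
            if (seen.get(x)) {
                result.add(x); // second sighting of x
            } else {
                seen.set(x);   // first sighting: mark x as seen
            }
        }
        return result;
    }
}
```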


Approach 4: Mark Visited Elements in the Input Array itself

Intuition

All the above approaches have ignored a key piece of information in the problem statement:

The integers in the input array arr satisfy 1 ≤ arr[i] ≤ n, where n is the size of array. [2]

This presents us with two key insights:

  1. All the integers present in the array are positive. i.e. arr[i] > 0 for any valid index i. [3]
  2. Decrementing any integer present in the array gives a valid index into the array.
    i.e. for any integer x in the array, x-1 is a valid index, and thus arr[x-1] is a valid reference to an element in the array. [4]

Algorithm

  1. Iterate over the array and for every element x in the array, negate the value at index abs(x)-1. [5]
    • The negation operation effectively marks the value abs(x) as seen / visited.

Pop Quiz: Why do we need to use abs(x), instead of x? (Because an earlier iteration may already have negated the element we're reading, so the stored value could be -x; abs recovers the original value.)

  2. Iterate over the array again, for every element x in the array:
    • If the value at index abs(x)-1 is positive, it must have been negated twice. Thus abs(x) must have appeared twice in the array. We add abs(x) to the result.
    • In the above case, when we reach the second occurrence of abs(x), we need to avoid fulfilling this condition again. So, we'll additionally negate the value at index abs(x)-1.

Pop Quiz: Can you do this in a single loop?

Definitely! Notice that if an element x occurs just once in the array, the value at index abs(x)-1 becomes negative and remains so for all of the iterations that follow.

  1. Traverse through the array. When we see an element x for the first time, we'll negate the value at index abs(x)-1.
  2. But, the next time we see an element x, we don't need to negate again! If the value at index abs(x)-1 is already negative, we know that we've seen element x before.

So, now we are relying on a single negation to mark the visited status of an element. This is similar to what we did in Approach 3, except that we are re-using the array (with some smart negations) instead of a separate set.
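Putting the single-loop variant together, a Java sketch (class name is illustrative) might read:

```java
import java.util.ArrayList;
import java.util.List;

class NegationMarking {
    public static List<Integer> findDuplicates(int[] nums) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < nums.length; i++) {
            // abs() is needed because nums[i] may already have been negated
            // by an earlier iteration.
            int x = Math.abs(nums[i]);
            if (nums[x - 1] < 0) {
                result.add(x);               // index x-1 already marked: x seen before
            } else {
                nums[x - 1] = -nums[x - 1];  // first sighting: mark index x-1
            }
        }
        return result;
    }
}
```

Note that this mutates nums; if needed, the original values can be restored afterwards by taking absolute values in one more pass.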

Complexity Analysis

  • Time complexity : O(n)\mathcal{O}(n). We iterate over the array twice. Each negation operation occurs in constant time.

  • Space complexity : No extra space required, other than the space for the output list. We re-use the input array to store visited status.



  1. C++ provides an excellent std::bitset in the standard library. ↩︎

  2. Some readers will notice a similarity with the pigeonhole principle. While this doesn't really come into play in Approach 4, we utilized it indirectly in Approach 3: since some elements appear twice, the number of unique elements is less than the size of the array. If every unique element gets a bucket in our map / set, some buckets are bound to have more than one element in them! ↩︎

  3. Because arr[i] >= 1 for any valid index i of array arr. ↩︎

  4. Because all elements in the array are integers that lie in the range [1, n] (where n is the length of the array). Thus, their decrements are integers that lie in the range [0, n-1] (which is precisely the set of valid indices for an array of length n). ↩︎

  5. The abs() function provides the absolute value. ↩︎


Comments: 30
prerna-p:

Pop Quiz: Why do we need to use abs(x), instead of x?

Here's a crazy idea, how about you explain this since this is a solution article lol?

shailpanchal2005:

For those who didn't understand solution 4,
You'll get an idea after looking at solution of https://leetcode.com/problems/find-all-numbers-disappeared-in-an-array/

vminstance:

The last approach is a clever one. Here is another idea - if we're OK with modifying the input array itself, we could leverage the fact that each element can be placed in the "right" spot only once.
I.e. (assuming 1-based indexing) if nums[1] = 1, then any other 1 in the array would be identified as a duplicate.

So the idea is to keep moving elements to their spots by swapping the i-th element with nums[nums[i] - 1] (here the index is already zero-based).
Whenever we notice that the other element (nums[nums[i] - 1]) is already in its place, it means we're seeing nums[i] as a duplicate.
And there would be no more than n/2 swaps required to get all the elements to their right spots. A code snippet is below:

// int[] nums - is an incoming parameter

// Result list of duplicated values.
var output = new List<int>();
for (int i = 0; i < nums.Length; i++) {
    // nums[i] > 0 means the i-th element hasn't been marked as a duplicate
    // (0 is used as a placeholder). The expected value at position i is i + 1,
    // because the index is zero-based.
    while (nums[i] > 0 && nums[i] != i + 1) {
        SwapAndRegisterDups(nums, i, output);
    }
}

return output;

And then the helper function SwapAndRegisterDups simply swaps the two elements at indexes i and nums[i] - 1, or adds nums[i] to the result set if nums[nums[i] - 1] is already in its place.

var first = nums[i];
var second = nums[nums[i] - 1];
if (second == nums[i]) {
  // Register nums[i] as dupe and set it to 0
}
else{
  // Swap first and second.
}

ACSAIS:

Approach 5 using Counting sort
We know that 1 ≤ a[i] ≤ n (n = size of array). So we can just find out how many times each value appears in the array and then add to result those elements that appear twice.

class Solution {
    public List<Integer> findDuplicates(int[] nums) {
        int[] counter = new int[nums.length + 1];
        List<Integer> result = new ArrayList();
        for (int i = 0; i < nums.length; i++) {
            counter[nums[i]]++;
        }
        for (int i = 0; i < counter.length; i++) {
            if (counter[i] > 1) result.add(i);
        }
        return result;
    }
}

ebo-bo:

output still uses extra space no? maybe specify that too?

ltbtb_rise:

such a brain teaser for solution 4, it is really smart though!

ncstguy:

If the elements of the array fall within some fixed range, eg: between 0 and 100, a fifth way to detect duplicates using O(1) space and O(N) time is to concatenate two 64 bit integers side by side, treat it like a single 128 bit number and turn on the k'th bit to indicate if the number k was seen already in the array. It only works for small ranges though.

roddamramesh:

Using Cyclic Sort in O(n) and O(1) space and beats 100%

class Solution {
    public List<Integer> findDuplicates(int[] nums) {
        List<Integer> lst = new ArrayList<Integer>();
        // Cyclic sort: keep swapping until value v sits at index v - 1.
        for (int i = 0; i < nums.length; i++) {
            while (nums[i] != nums[nums[i] - 1]) {
                int temp = nums[nums[i] - 1];
                nums[nums[i] - 1] = nums[i];
                nums[i] = temp;
            }
        }
        // Any value not at its home index is a duplicate.
        for (int j = 0; j < nums.length; j++) {
            if (nums[j] != j + 1) {
                lst.add(nums[j]);
            }
        }
        return lst;
    }
}

nitinsharma1988:

def findDuplicates(self, nums):
    # Cyclic sort: keep swapping nums[i] toward its home index nums[i] - 1.
    i = 0
    duplicates = set()
    while i < len(nums):
        if nums[i] != i + 1:
            j = nums[i] - 1
            if nums[i] != nums[j]:
                nums[i], nums[j] = nums[j], nums[i]
            else:
                # The home slot already holds this value: it's a duplicate.
                duplicates.add(nums[i])
                i += 1
        else:
            i += 1
    return duplicates

racyhippo:

Maybe I missed something, but none of the solutions is both O(n) AND uses no extra space.
A hash/set would need O(n) space. A bitset (mark visited) is O(n) space too. Sort and check duplicates is O(n log n).
